Speeding up the Convergence of Real-Time Search: Empirical Setup and Proofs

Authors

  • David Furcy
  • Sven Koenig
Abstract

This technical report contains the formal proofs for all of our theoretical results, as well as a description of our experimental setup for all of the results given in our AAAI-2000 paper entitled Speeding up the Convergence of Real-Time Search. In that paper, we propose to speed up the convergence of real-time search methods such as LRTA*. We show that LRTA* often converges significantly faster when it breaks ties towards successors with smallest f-values (à la A*) and even faster when it moves to successors with smallest f-values instead of only breaking ties in favor of them. FALCONS, our novel real-time search method, uses a sophisticated implementation of this successor-selection rule and thus selects successors very differently from LRTA*, which always minimizes the estimated cost to go. Our approach opens up new avenues of research for the design of novel successor-selection rules that speed up the convergence of both real-time search methods and reinforcement-learning methods. Indeed, our AAAI-2000 paper presents experiments in which FALCONS finds a shortest path up to sixty percent faster than LRTA* in terms of action executions and up to seventy percent faster in terms of trials. In this report, we first describe our experimental setup and then prove that FALCONS terminates and converges to a shortest path.
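The three successor-selection rules contrasted in the abstract can be illustrated with a minimal sketch. It is not the authors' FALCONS implementation: the function names, the dictionary-based representation, and the secondary tie-breaking key in the third rule are assumptions made for illustration. Here `h[s]` stands for the estimated cost to go from `s`, `g[s]` for the estimated cost from the start to `s`, and `f = g + h` as in A*.

```python
from typing import Dict, List


def lrta_star_rule(succs: List[str], cost: Dict[str, float],
                   h: Dict[str, float]) -> str:
    """LRTA*-style rule: minimize the estimated cost to go, cost + h.

    Ties are left to iteration order (first minimum wins).
    """
    return min(succs, key=lambda s: cost[s] + h[s])


def tie_break_on_f(succs: List[str], cost: Dict[str, float],
                   h: Dict[str, float], g: Dict[str, float]) -> str:
    """Minimize cost + h first; among ties, prefer the smallest f = g + h."""
    return min(succs, key=lambda s: (cost[s] + h[s], g[s] + h[s]))


def move_to_min_f(succs: List[str], cost: Dict[str, float],
                  h: Dict[str, float], g: Dict[str, float]) -> str:
    """Minimize f = g + h first.

    Assumption of this sketch: remaining ties are broken by cost + h.
    """
    return min(succs, key=lambda s: (g[s] + h[s], cost[s] + h[s]))


if __name__ == "__main__":
    # Hypothetical values, chosen so that the three rules disagree.
    succs = ["a", "b", "c"]
    cost = {"a": 1, "b": 1, "c": 1}   # cost of moving to each successor
    h = {"a": 3, "b": 3, "c": 4}      # estimated cost to go
    g = {"a": 5, "b": 2, "c": 0}      # estimated cost from the start

    print(lrta_star_rule(succs, cost, h))     # a
    print(tie_break_on_f(succs, cost, h, g))  # b
    print(move_to_min_f(succs, cost, h, g))   # c
```

With these hypothetical values, the first rule cannot distinguish `a` from `b`, the second uses f-values only to separate that tie, and the third lets f-values drive the choice outright. The sketch covers successor selection only; how the h- and g-values are updated during search is described in the report itself and is not reproduced here.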

Similar resources

Speeding up the Convergence of Real-Time Search

Learning Real-Time A* (LRTA*) is a real-time search method that makes decisions fast and still converges to a shortest path when it solves the same planning task repeatedly. In this paper, we propose new methods to speed up its convergence. We show that LRTA* often converges significantly faster when it breaks ties towards successors with smallest f-values (a la A*) and even faster when it move...

Value Back-Propagation versus Backtracking in Real-Time Heuristic Search

One of the main drawbacks of the LRTA* real-time heuristic search algorithm is slow convergence. Backtracking as introduced by SLA* is one way of speeding up the convergence, although at the cost of sacrificing first-trial performance. The backtracking mechanism of SLA* consists of back-propagating updated heuristic values to previously visited states while the algorithm retracts its steps. In ...

Prioritized-LRTA*: Speeding Up Learning via Prioritized Updates

Modern computer games demand real-time simultaneous control of multiple agents. Learning real-time search, which interleaves planning and acting, allows agents to both learn from experience and respond quickly. Such algorithms require no prior knowledge of the environment and can be deployed without pre-processing. We introduce Prioritized-LRTA*, an algorithm based on Prioritized Sweeping. This ...

A fuzzy mixed-integer goal programming model for a parallel machine scheduling problem with sequence-dependent setup times and release dates

This paper presents a new mixed-integer goal programming (MIGP) model for a parallel machine scheduling problem with sequence-dependent setup times and release dates. Two objectives are considered in the model to minimize the total weighted flow time and the total weighted tardiness simultaneously. Due to the complexity of the above model and uncertainty involved in real-world scheduling probl...

Integrated JIT Lot-Splitting Model with Setup Time Reduction for Different Delivery Policy using PSO Algorithm

This article develops an integrated JIT lot-splitting model for a single supplier and a single buyer. The model considers setup time reduction, and the optimal lot size is obtained from the reduced setup time through joint optimization for both buyer and supplier, under deterministic conditions with a single product. Two cases are discussed: Single Delivery (SD) case, and Multi...

Publication date: 2000